DOCUMENT RESUME 



ED 413 349 


TM 027 657 


AUTHOR 


Land, Robert 


TITLE 


Moving Up to Complex Assessment Systems. Proceedings from 
the CRESST Conference (Los Angeles, CA, September 5-6, 
1996).


INSTITUTION 


National Center for Research on Evaluation, Standards, and 
Student Testing, Los Angeles, CA. 


SPONS AGENCY 


Office of Educational Research and Improvement (ED),
Washington, DC. 


PUB DATE 


1997-00-00 


NOTE 


25p. 


CONTRACT 


R305B60002 


AVAILABLE FROM 


UCLA Center for the Study of Evaluation, 10920 Wilshire 
Boulevard, Suite 900, Los Angeles, CA 90024-6511. 


PUB TYPE 


Collected Works - Proceedings (021) -- Collected Works - 
Serials (022) -- Reports - Evaluative (142) 


JOURNAL CIT 


Evaluation Comment; v7 n1 p1-22 Sum 1997


EDRS PRICE 


MF01/PC01 Plus Postage.


DESCRIPTORS 


Conferences; *Educational Assessment; Educational Research;
Educational Technology; Elementary Secondary Education;
*Evaluation Utilization; Models; Reliability; Research and
Development; *Standards; *Test Use; Validity


IDENTIFIERS

*Center for Research on Eval Standards Stu Test CA

ABSTRACT

Assessment systems to measure high educational standards
emerged as the major theme at the 1996 conference of the National Center for
Research on Evaluation, Standards, and Student Testing (CRESST), "Moving Up
to Complex Assessment." This feature article, providing a summation of the
proceedings of the conference, reports that the approximately 250 educators 
and community leaders were in general agreement that challenging standards 
are the key to the improvement of American education. In opening remarks, the 
codirectors of CRESST, Eva L. Baker and Robert L. Linn, explained the
conceptual model that is guiding CRESST assessment research and development 
in the next 5 years. This model focuses on the utility of assessment systems 
for various purposes and establishes long-range goals for the Center's 
research. The CRESST model highlights three qualities that are essential to 
the productive use of assessment: validity, fairness, and credibility. 
Conference presentations centered on broad areas related to these qualities: 
(1) developing valid, fair, and credible assessments; (2) enhancing the 
utility of assessments; and (3) exploring the role that technology can play 
in creating new possibilities in developing and using assessment systems. 

(SLD) 










Evaluation Comment 

A Publication of UCLA’s Center for the Study of Evaluation and
Graduate School of Education & Information Studies



National Center for Research on Evaluation, Standards, and Student Testing



Feature Article 



Summer 1997, Vol. 7, No. 1 






Moving Up to Complex Assessment Systems 

Proceedings From the 1996 CRESST Conference 






Robert Land UCLA/CRESST 






Assessment systems to measure
high educational standards emerged
as the major theme at this year’s CRESST
conference, Moving Up to Complex Assessment
Systems, September 5-6, 1996, at UCLA’s Sunset
Village Conference Center. The broad appeal
of the agenda was reflected by the diversity of 
the conference participants. Approximately 250 
researchers, community leaders, teachers, prin- 
cipals, school board members, state and federal 
education officials, and representatives of pri- 
vate and commercial interests attended two full 
days of presentations and assessment forums. 

Many conference presenters expressed the
strong opinion that challenging standards are
the backbone of improvement in American
education.




“I’d rather set the [standards] bar high,”
said Sidney Thompson, superintendent of the
Los Angeles Unified School District, “and have
us look at how we’re going to help the student
get over that bar, than set the bar low and know
that when she or he got over it, it didn’t mean
a darn thing.”

Thompson noted that the Los Angeles 
Unified School District has joined all 50 states 
and many large school districts in setting stan- 
dards for what children should know and be 
able to do across multiple grade levels and 
topics. The District and CRESST are working 
together to develop a new standards-based 
assessment system comprising a commercial
standardized test, performance-based assessments
based on CRESST instructional
models, and classroom tests to improve in- 
struction, learning, and student performance. 

“Our current emphasis on high, challeng-
ing standards for all students,” said CRESST
Co-director Eva Baker in her conference pre-
sentation, “can be traced to the 1989 Gover-
nors’ Education Summit and was reinforced
in the 1994 Goals 2000 legislation and the
recent Improving America’s Schools Act,¹
which reauthorized federal Title I programs.”

“By 1997-98,” explained Baker, “Title I
schools must have in place challenging con-
tent and performance standards in at least
reading and mathematics, followed by high-
quality assessments of those standards in the
2000-2001 school year. The assessments
must involve multiple approaches and mea-
sure complex thinking skills and understand-
ing of rigorous content. Performance
assessments will be part of the system but
pose some unique challenges in terms of the
time, costs and technical quality required to
develop accurate measures of individual ac-
complishment.”

¹ Improving America’s Schools Act of 1994, Confer-
ence Report 103-761. Regarding Public Law 103-
382, signed October 20, 1994, (pp. 6-33). Washing-
ton, DC: House of Representatives.

“One of our biggest challenges is to learn 
how performance assessments can work with 
traditional assessments,” noted CRESST Co- 
director Robert Linn in his opening confer- 
ence remarks, “and how these assessments 
fit into the larger education reform picture, 
from the classroom to the national level. We’ve 
moved from a focus on single instruments to 
a system perspective.” 

In other opening remarks, both Baker and 
Linn explained the new conceptual model that 
is guiding the CRESST assessment research 
and development efforts for the next five years. 
The CRESST model (Figure 1) focuses on the
utility of assessment systems for various pur- 
poses and audiences and establishes three 
important, long-range social goals for the 
Center’s research: 



♦ providing new knowledge and 
understanding about educa- 
tional quality; 

♦ contributing to educational im- 
provement in policy, accounta- 
bility, and teaching and learning; 
and 

♦ encouraging productive public 
engagement in education. 



The CRESST model highlights three 
qualities that are essential to the productive 
use of assessment, namely, validity, fairness 
and credibility, and focuses the CRESST 
research agenda on understanding the 
relationships among and between these 
qualities and effective assessment systems. 

Conference presentations focused on three
broad areas related to the CRESST conceptual
model: developing valid, fair, and credible as-
sessments; enhancing the utility of assess-
ments; and finally, the role that technology can
play in creating new possibilities in both de-
veloping and utilizing assessment systems.

DEVELOPING VALID, FAIR, AND 
CREDIBLE ASSESSMENTS 

In the past, researchers, educators, and
policy makers seemed satisfied if
assessments met relatively narrow cri-
teria of technical quality. But as the pur-
poses of assessment have grown and the
demand for more inclusive and informa-
tive tests has increased, both test devel-
opers and test users have recognized that
narrow technical criteria are not enough.
Echoing themes from the CRESST model,
conference presenters took a comprehen-
sive view of what is required for good
assessment: an expanded view of valid-
ity, including attention to intended pur-
poses and consequences; heightened
concerns for inclusion of and fairness to
all students; and recognition of the im-
portance of public credibility.



Please note that beginning with this issue of
Evaluation Comment, we shall resume the use of
volume and issue numbers.



Validity 

Today’s assessment systems are
intended to serve multiple audiences at
the federal, state, local, classroom, and 
student levels and likewise are intended to 
serve a range of purposes, from communi- 
cating standards and promoting accountabil- 
ity, to contributing to school improvement, in- 
forming teaching and learning, and improv- 
ing student performance. These demands 
bring new complexity to assuring that assess- 
ment systems provide accurate information 
for decision making. Conference participants 
particularly highlighted three areas warrant- 
ing sustained effort: alignment, the measure- 
ment of progress, and linking the results from 
multiple measures. 






Alignment 

That assessment is to be aligned with
rigorous standards for student
achievement is a defining feature of
today’s assessments. As Ed Reidy, deputy
commissioner of the Kentucky Department of
Education, expressed it, “Assessment is both
a very central part of reform and the index for
judging the success of that reform.” Assess-
ment is intended to stimulate reform by com-
municating these standards, holding educa-
tors and students accountable for achieving
them, and providing an accurate measure of
students’ performance on the standards.

“It seems simple — adopt standards and 
make assessments that are aligned with 
them — but there is a lot more involved,” noted
Robert Linn. 

What does such alignment really mean? 
How do states, districts, and schools know 
whether their assessments are aligned? “How 
the major elements of an education system 
work together to guide the process of helping 
students achieve higher levels of mathemati-
cal and scientific understanding,” said Norman
Webb, Wisconsin Center for Educational Re-
search, “goes beyond a simple content analy-
sis.”

Citing results of a study by the Council of 
Chief State School Officers (CCSSO), Webb 
reported that few states have addressed these 
questions with much rigor and even fewer have 
examined broader alignment issues. 
Compounding the problem, states lack a
formal and systematic process for developing
assessments directly from their standards;
instead, they have developed assessments
before or at the same time as their standards.



Based on his review of current practices 
and relevant literature, Webb presented five 
categories of criteria that states and local dis- 
tricts can use to evaluate alignment. 

These categories include:

♦ pedagogical implications;

♦ equity and fairness;

♦ articulation across grades and
ages;

♦ system applicability; and

♦ content focus.



Content focus includes topic coverage, depth 
and range of student knowledge, and balance 
of representation. 

Several presenters reported that even 
when assessments are developed directly from 
standards, alignment can be complicated. 
David Wiley, technical director of the New 
Standards Project, summed up the general 
problem by stating that standards — even per- 
formance standards anchored in students’ 
work — are not specified well enough for pur- 
poses of test development. They do not ad- 
equately guide the concrete decisions that 
need to be made on what is to be measured, 
how it is to be measured, and what specific 
tasks and criteria will be used. 

To develop the New Standards mathemat-
ics assessments, Wiley found it was neces-
sary to create an “infrastructure that would
provide another level of construct definition.”
The New Standards’ seven measurable math-
ematics standards, for example, were clustered
under three constructs: “concepts,” “skills,”
and “problem solving”; these in turn were “de- 
fined in terms of student capabilities and the 
processes needed for successful task perfor- 
mance.” The constructs were used to assure a 
balanced test instrument and were the basis 
for standards-setting and reporting. 

Eva Baker recommended the use of an in- 
termediate strategy to provide a “crosswalk be- 
tween standards and assessments,” but one 
with the added advantage of providing gener- 
alizable assessment models that increase the 
cost-effectiveness and classroom utility of 
large-scale assessments. Based in cognitive
theory, the CRESST models focus on core
types of learning that recur across the curricu-
lum: conceptual understanding, knowledge
representation, problem solving, communica-
tion, and teamwork. The models provide
specifications for developing assessment tasks
and scoring rubrics that are operationalized
in specific subject areas and customized to lo- 
cal curriculum emphases and grade levels to 
be tested. Teachers can use the CRESST mod- 
els to create classroom instruction and assess- 
ment, and to align their practices with 
established standards and assessments. 

David Niemi, University of Missouri, and 
Zenaida Aguirre-Munoz, CRESST/UCLA, 
described the application of the CRESST 
models in Hawaii and elsewhere, where they 
have been used for assessing history and 
mathematics. Among the advantages Niemi 
and Aguirre-Munoz noted were improved 
replicability and comparability of tasks and 
results, enhanced system alignment, greater 
efficiency, and improved engagement of 
teachers and the public in all aspects of large- 
scale assessment. 

A number of conference participants 
stressed that aligning standards and assess- 
ments is only one piece of what’s required for 
the success of current reforms. Professional 
development, curriculum and instruction, in- 
centives and sanctions, resource allocation, 
district and school infrastructure — all these
must be aligned with standards if real progress 
is to be made. 

Aligning standards and assessment with 
curriculum is essential if student learning is 
to be affected. But determining such alignment 
can be complex, as discussed by William 
Schmidt, Michigan State University. In his re- 
search on the Third International Mathemat- 
ics and Science Study (TIMSS), Schmidt 
found that even deciding whether or not a
country covered a particular mathematics topic 
turned out to be anything but simple. 

“From a curricular perspective,” Schmidt 
explained, “math is not math everywhere. 
There is little overall overlap in the countries 
we studied.” 

To create the assessment, it was neces- 
sary to develop sets of curricularly sensitive 
items that address areas where groups of
countries show overlapping curricula and a
sophisticated methodology for characterizing 
curriculum. 

Measuring Progress 

The challenge of accurately assessing
student progress was highly salient to 
a number of conference participants 
who were struggling with Title I requirements. 
The key issue was the mandate that schools 
must show adequate yearly progress sufficient 
to enable all students to achieve high stan- 
dards of accomplishment within a reasonable 
period of time. Like the alignment of standards
and assessment, accurately measuring
progress is easier said than done. 

“From a validation point of view,” said Bob 
Linn, “we have to ask the question: How do 
we know when we see improvement?” 

Linn provided an example from Kentucky, 
where student achievement as measured by 
statewide assessments showed significant 
year-to-year increases while the National As- 
sessment of Educational Progress (NAEP) in- 
dicated little or no significant change. 

“There may be problems of test compara- 
bility from year to year,” explained Linn, 
“caused by differences in conditions of test 
administration or the degree of alignment be- 
tween the test and the state’s instructional 
goals.” Linn noted other factors that might ac- 
count for the increases in student achievement 
on the statewide test, including the substan-
tial incentives and sanctions associated with
student test performance, possible changes 
in school populations, students becoming 
more familiar with the test format, teachers 
who changed instruction to match the state 
test content, or some technical design issue 
associated with performance assessment. 

Bengt Muthén, CRESST/UCLA, under-
scored the difficult analytic challenges of mea-
suring student progress by identifying some
key problems, including:

♦ selecting an analysis method
that gives the best picture of 
performance; 

♦ analyzing and interpreting
the interaction among multiple
growth processes;

♦ determining the size and dura- 
tion of instructional and other 
treatment effects; and 

♦ understanding the aggregate 
impact of various school experi- 
ences and programs and the con- 
tributions of individual student 
characteristics. 



Muthén described CRESST’s research
program to address these problems. Using
longitudinal student performance data in a
variety of skill areas, Muthén is developing
new, multilevel modeling tools to analyze 
student progress and the contributions of 
school, classroom, and other factors to such 
progress. In addition to providing new 
technical knowledge useful to researchers, this 
research should provide policy makers with 
practical guidance for formulating, reporting, 
and interpreting assessment results within and 
across schools. 



Linking 

Linking, a third validity challenge in
current assessment systems, was also 
subject to wide discussion at the con- 
ference. States and local districts are devel- 
oping unique assessments based on their own 
standards. Yet their publics — parents, com- 
munity, policy makers, and students — still 
want to know how their students’ performance 
compares with that of others — from other lo- 
calities, other states, nationally, and interna- 
tionally. If students meet local or state stan- 
dards, does it mean they are nationally or in- 
ternationally competitive? What are the ground 
rules for linking results, especially across dif- 
ferent types of measures, such as norm-refer- 
enced tests and performance assessments? 

Conference speakers discussed a number 
of efforts to link results between the National 
Assessment of Educational Progress (NAEP), 
state-by-state NAEP assessments, and inter- 
national tests such as the Third International 
Mathematics and Science Study (TIMSS). 

For example, Sharif Shakrani, National 
Center for Education Statistics, presented a 
number of technical and practical challenges 
in making links that might permit states to 
compare their test results nationally and inter- 
nationally. 

“The market-basket approach appears to 
be one promising option,” explained Shakrani, 
“where states could administer representative 
samples of items drawn from the full set of 
NAEP items along with their state assessments
to get good estimates of how students in their
state compare with students nationally.”
Shakrani added that the market-basket ap-
proach, applied to NAEP and state assessment,
“would have the additional advantage of pro- 
viding rapid turnaround time, perhaps as 
quickly as three months, and would allow more 
frequent NAEP testing in more subjects.” 

In discussing his agenda for the National 
Center for Education Statistics, Commissioner 
Pascal Forgione endorsed research on the 
market-basket approach and efforts to link 
state, national, and international data. He also
discussed plans to study the feasibility of 
embedding robust NAEP items in states’ non- 
NAEP assessments to generate more timely 
and cost-effective state-level NAEP scores. 
Further, Forgione revealed plans for linking 
NAEP with other national testing data to build 
a more comprehensive, cost-effective, and 
consistent database for policy makers and re- 
searchers. 



Fairness 

While fairness is an integral compo-
nent of validity, the CRESST model 
and conference participants identi- 
fied it as needing concerted attention in cur- 
rent assessment systems. Prompted in part 
by recent Title I legislation, which will affect
approximately 67% of schools in the United
States, states and districts are increasingly in-
terested in assessments that will contribute
to educational improvements for all students, 
and they are committed to the testing of all 
students. 






“The challenge,” said Adrienne Bailey, 
senior consultant for the Council of the Great 
City Schools (CGCS), “is to get all students
to perform at higher levels than they have 
before.” 

But the urgency to improve schools 
underscores the need to make sure that the 
process is fair. “Results from a recent survey 
of Council schools,” said Bailey, “show that 
85% of the Council districts have changed or 
are changing their assessments to align with 
new national, state, or local content stan- 
dards.” Based on the volume of reform in pro- 
cess, Bailey emphasized that broad 
community involvement is essential in order 
to produce standards and assessments that 
are supported by diverse, urban communities 
and that help urban students to achieve world- 
class levels of performance. 

“Equity and fairness are no longer simply
issues of morality,” warned CRESST partner
and Yale Professor Emeritus Edmund Gordon;
“in educational measurement, they emerge as
being at the very core of what our work is
about.”

Ideally, the quality of inferences from as- 
sessments will be judged in terms of their ac- 
curacy and appropriateness for all people, of 
all backgrounds and needs. For this ideal to 
be approached, every aspect of the assessment 
process must be fair — from assessment de- 
velopment through administration and inter- 
pretation of results. However, although often 
heralded as a bridge to opportunity, testing and 
assessment have more often been viewed as 
unfair to underrepresented minorities and as 
barriers to educational access. 

“To help improve education for all stu-
dents,” added Gordon, “tests must go beyond
telling how a student is performing, to giving 
useful information about how to improve that 
performance. The system must be committed 
to adapting to the diverse needs of all stu- 
dents.” 

Gordon identified four broad categories of 
fairness issues: 

♦ the political economy of 
educational assessment; 

♦ limitations in the political 
and technical capacities of 
pedagogy and assessment; 



♦ epistemological and 
theoretical contexts for 
educational assessment; 
and 

♦ the technological de- 
mands of equitable 
systems of assessment. 



“We must learn to factor into our peda- 
gogical and assessment practices the condi- 
tional and situational correlates of human 
performance,” said Gordon. “Why, for ex- 
ample,” he asked, “do Black students do bet- 
ter on tests when the test administrator is Black 
rather than White? To make fair tests we will 
have to change, expand, and achieve better 
symmetry among our concepts of knowledge, 
pedagogy, and intelligence. We will have to 
honor and accommodate diversity. And we will 
certainly have to move beyond traditional 
multiple-choice tests.” 

“This task,” concluded Gordon, “will 
require a strategic plan for assessment 
development, and the new CRESST model 
points us in that direction.” 

In addition to addressing issues of fair- 
ness for students who are currently included 
in testing, CRESST research is focusing on 
two groups who traditionally have been ex-
cluded from large-scale assessments: students
with disabilities and language minority stu- 
dents who are not fully proficient in English. 

Students With Disabilities

How many students with disabilities are
currently excluded from testing? What 
are the characteristics of these stu- 
dents? What kinds of accommodations are 
currently being used? The answers to these 
key questions are far from clear, according to
Linda Bond, North Central Regional Educa-
tional Laboratory. In a national survey of state-
wide testing programs, Bond found tremen-
dous variability in states’ estimates of how
many students with disabilities were excluded 
from the state test. In general, anywhere from 
5% to 10% of all students were excluded de- 
pending on the particular assessment and the 
state. 

Similarly, Daniel Koretz, CRESST/RAND, 
cited U.S. Department of Education statistics 
indicating that states’ estimates of students 
with disabilities in their state vary widely, from 
5.5% to 15%, with an overall average of about 
10%. Estimates of specific disabilities such 
as mental retardation or learning disabilities 
are even more variable. 



“Differences among states,” Koretz sug-
gested, “result mostly from differences in defi-
nitions of students with disabilities, rather than
real differences in where students with dis-
abilities live.”

Problems with counting and defining stu- 
dents with disabilities are not surprising given 
that states lack clear guidelines defining those 
students who should and should not be in- 
cluded in their assessments. James 
Ysseldyke, National Center on Educational 
Outcomes (NCEO), found that state guidelines 
on inclusion range from a single descriptive 
sentence to 60 pages of directions. To help 
states include more students with disabilities 
in their testing programs, NCEO is revising 
and clarifying their guidelines for inclusion 
practices. 

Good descriptive information and sound 
guidelines, however, may not be enough. 
Some districts and states are still reluctant to 
include students with disabilities for fear that 
low performance will hurt the child’s self- 
esteem and lower overall state scores. 

“Nearly 70% of fourth-grade students cur- 
rently excluded from state NAEP assessments 
could take those tests,” claimed presenter Fran
Stancavage, American Institutes for Research. 
Stancavage urged increased adaptive testing 
strategies because current assessments are 
not providing the necessary information about 
the performance of the least able and the most 
advanced students. Ysseldyke agreed, observ- 
ing that “the most difficult measurement is- 
sues get addressed at the margins of the 
distribution — very high and very low perform-
ing students.” These factors suggest that as-
sessment developers must make tests
simultaneously more challenging and more ac- 
cessible. 

Accommodations will be a key to increas- 
ing accessibility, at least for some students. 
But deciding who needs accommodations, 
what kinds of accommodations are feasible, 
and whether accommodations create an un- 
fair advantage for test takers will be a major 
undertaking. Bond reported that most states
have little problem modifying testing condi-
tions for physically disabled students, offer-
ing Braille tests for blind students, for example. 
But accommodations for cognitively disabled 
students are not as common; and the higher 
the stakes, the fewer the accommodations be- 
cause of test validity concerns. 

“The purpose of accommodation from a 
measurement standpoint,” responded Koretz, 
“is to offset bias in order to make the mea- 
surement more valid than it would otherwise 
be.” 



Discussing CRESST’s program of research 
on accommodations and adaptations for spe- 
cial needs students, Koretz described research 
on the use of paraphrasing for mildly men- 
tally retarded students taking Kentucky’s sci- 
ence tests. Issues Koretz is struggling with are 
the meaning of the scores obtained with and 
without accommodations and policies for as- 
sessing students whose classifications as dis- 
abled are ambiguous. 

Scott Trimble, Kentucky Department of 
Education, spelled out his state’s strong com- 
mitment to include all students in their state 
assessments. Kentucky districts and schools 
are advised to use the same accommodations 
in the assessment that they use in instruction. 

“Instructionally relevant accommoda- 
tions,” added Trimble, “such as the use of para- 
phrasing, extended time frames, and smaller 
group settings, seem to work fairly well.” 

But he cautioned that accommodations 
are not magic solutions to all problems asso- 
ciated with assessing students with disabili- 
ties. For example, students may receive special 
attention in the classroom one year, but not in 
another year. Trimble also questioned whether
year-to-year scores for students with disabili- 
ties are fair measures of their progress when 
one year they are tested with accommodations 
and another year they are not. 

Language Minority Students

“Limited English proficient [LEP] students
are more likely to be tested than stu- 
dents with disabilities,” said Charlene 
Rivera, George Washington University. “But 
LEP students are less likely to be given ac- 
commodations,” she added. As an example, 
Rivera reported that in 1994, 17 states required
students to pass a high school graduation test 
to earn a diploma. While 13 of the 17 states 
permitted accommodations, only 2 states of- 
fered tests specifically designed for LEP stu- 
dents. “Scores on tests that assume English 
proficiency are likely to grossly underestimate 
LEP students’ academic achievement,” sug- 
gested Rivera. 

“For purposes of accountability, improved 
teaching, and student learning,” said Lorrie 
Shepard, CRESST/University of Colorado at 
Boulder, “we need assessment systems that 
can identify student performance on relevant 
continua of proficiency. For LEP students,”
added Shepard, “this requires multiple mea-
sures that distinguish English language pro-
ficiency, native language proficiency, and
academic achievement. Such systems are far
from being available now. In particular, cur-
rent measures of English language proficiency
are few and limited, yet are essential to under-
standing LEP students’ performance.”

“Chicago and the entire state of Illinois
are working intensively to develop good mea- 
sures of English language proficiency,” said 
presenter Carole Perlman, Chicago Public
Schools. The state is developing large-scale
assessments for LEP students in Grades 3-11
that will be used in 1997. They will include 
multiple-choice reading tests and writing as- 
sessments with both textual and graphic 
prompts. “In the Chicago Public Schools,” 
added Perlman, “the focus is on giving teach- 
ers of bilingual classes the tools to develop 
and use performance assessments to docu- 
ment both native and English language 
achievement.” 

Efforts to develop appropriate accommo- 
dations for LEP students are just beginning. 
Translating tests into students’ native language
seems like a straightforward solution; and in-
deed, the two states Rivera identified as offer- 
ing alternatives to their high school graduation 
tests use translations. But if students haven’t 
learned the content in their native language, 
the value of a translated test is debatable. “Fur- 
thermore,” Shepard warned, “linguistic and 
cultural differences make exact translation im- 
possible, casting doubt on the equivalency of 
scores such tests yield.” She called for more 
small-scale, focused research to provide good 
information for making accommodation deci- 
sions. Agreeing that translations do not meet 
the needs of all LEP students, Rivera nonetheless
called for their increased use, at least in
the short run, “especially for students who 
enter schools with a high degree of literacy in 
their native language.” 

Simplifying test language is another ac- 
commodation option. Based on his analysis 
of NAEP data, Jamal Abedi, CRESST/UCLA, 
cited evidence that complexly worded math- 
ematics questions depress LEP students’ 
scores. Abedi discussed CRESST research in- 
dicating that simplified wording significantly 
improved scores, at least for some students. 
Additional projects are underway to examine 
the effects of linguistic and other adaptations 
on limited English proficient and fully English 
proficient students of varying abilities. 

Using mixed-ability, collaborative work 
groups also may be an effective — and 
instructionally relevant — form of accommo- 
dation. During her research in classrooms with 
high LEP populations, presenter Noreen Webb, 
CRESST/UCLA, found that low-ability students 
scored better on performance assessments 
when they worked on similar tasks in groups 
with high-ability students prior to the test. 
“High-ability students’ scores were not af- 
fected,” said Webb. 

Although there are hints of promising di- 
rections, it is clear that there is much work to 
be done to develop valid, fair, and useful tests 
for all students. A complicating factor is that 
the work must be done with an eye toward mak- 
ing assessments that the public understands 
and trusts. 



Credibility 

“The most valid and fair assessment systems
imaginable will fail if they lack
public credibility,” said Eva Baker during
her conference presentation. “Validity without
credibility produces assessments that have
no life span and whose findings are contended, 
diminished, or dismissed,” added Baker. 

Several presenters addressed the impor- 
tance of CRESST’s third prerequisite for util- 
ity, arguing that public communication and 
engagement are essential in establishing as- 
sessment system credibility. 

Secrecy, driven partly by the legitimate 
need for test security, has long been a trade- 
mark of the measurement community. But the 
public is increasingly reluctant to accept as- 
sessments — new or otherwise — on blind 
faith. As a result, many members of the as- 
sessment community have found that they 
need improved communication and public re- 
lations skills to complement their technical 
skills. 

Lorraine McDonnell, CRESST/University 
of California, Santa Barbara, addressed the 
broader social and political context, which de- 
mands better communication of assessment 
information. Citing recent polls, McDonnell 
noted that only 25% of the public trusts gov- 
ernment institutions to do the right thing all 
or most of the time, and only 25% of the vot- 
ers, who decide whether or not to support pub- 
lic education with taxes, even have children in
school. Consequently, new assessment systems
must survive in a context of mistrust and
limited public understanding of education. “In
this environment,” McDonnell noted, “curricular
standards and assessments become the focal
point for many contested social values, not
just about what is important to learn, but about
how we define the good society and how those
ideals should be passed on to successive generations.”
Based on her research on the politics
of education reform in several states,
McDonnell suggested several guidelines for 
making assessments politically and publicly 
credible. 






“First, where political will is lacking to 
make a needed long-term investment,” said 
McDonnell, “an incremental approach may actually
yield better results than a comprehensive
approach.”

“Second, if a state decides to engage 
in substantial reform,” McDonnell argued, 
“strong political leadership is necessary. That 
leadership has to come, at least partially, from 
people who are electorally accountable,” 
added McDonnell, “not just from the educa- 
tion establishment and non-elected officials. 
Elected officials,” she pointed out, “are in regular
contact with constituents, and the constraint
of their two- or four-year electoral cycle
helps them bring a valuable, real-world per- 
spective to the process.” 

“Third,” McDonnell asserted, “the devel- 
opment of new curriculum standards and as- 
sessments cannot solely be a technical 
process with participation limited to experts.
. . . Public participation in open, two-way dialogues
is very important,” McDonnell noted,
“because it involves public deliberation about 
what skills and knowledge are most impor- 
tant for a productive life and active citizen- 
ship.” Acknowledging that building public 
consensus is a very difficult process, 
McDonnell warned that to avoid it would be 
to make a mockery of the notion of common 
standards. 

“Communicating a topic as complex as 
assessment,” said Leah Lievrouw, CRESST/ 
UCLA, “is a formidable challenge in any mod- 
ern, diverse society. . . . Despite improvements 
in mass communication, we live in an era of 
separation characterized by high levels of in- 
tra-group conversation and low levels of in- 
ter-group communication,” she added. 
Consequently, we may tend to target large 
media outlets while ignoring the types of 
smaller market electronic and print media tai- 
lored to many linguistic and ethnic minori- 
ties. “As a result,” explained Lievrouw, “our 
message may not get across to many of these 
important groups, and we need to rethink our 
media strategies.”

Richard Colvin, Los Angeles Times, emphasized
the crucial value of building public
credibility, something that was not done by the 
California State Department of Education dur- 
ing the California Learning Assessment Sys- 
tem crisis. “CLAS was a forty-million-dollar 
mistake,” said Colvin, “not because it produced 
invalid or unfair results, but because of wide- 
spread public distrust caused by perceptions 
that there were weird things on the test.” Cit- 
ing the bunker mentality that led to the demise 
of CLAS, Colvin urged researchers and test 
makers to engage in open and understandable 
communication with the public. He offered sev- 
eral tests that he feels assessments must pass 
to achieve public credibility: 

♦ The "barber chair" test. Ordinary 
people should be able to discuss the 
assessment in ordinary social situations. 
Assess familiar content that the public thinks 
children need to get along in the world. 

♦ The "realtor" test. Test scores affect 
housing values and, consequently, influence 
homeowners’ support for local schools. Report 
scores in a format simple enough that realtors 
can use them as a closer. 

♦ The "newspaper" test. Newspapers 
have limited space for even the most important 
stories. Give us results that we can fit into two 
or three columns. 



Challenging the research community to 
make testing understandable to the general 
public, Colvin advised that “there is a pace 
you can walk at that the public can follow, or 
you can run out ahead and lose everybody.” 

McDonnell’s, Lievrouw’s, and, particularly, 
Colvin’s remarks challenged — even nettled — 
some of the conference attendees, but the 
themes were strongly endorsed by other 
CRESST presenters who have been working 
hard to build public support for high standards 
and improved assessments. Moreover, these 
presenters universally argued that credibility 
and successful implementation were not possible
without active, broad-based public participation.

In a roundtable presentation, four state 
education officials — Duncan MacQuarrie, 
Washington; Wayne Martin, Colorado; Doris 
Redfield, Virginia; and Catherine Smith, Michi- 
gan — identified key constituencies that must 
be formally included in the process for any 
state-level reform to be successful. The list 
includes teachers, students, parents, school 
administrators, school board members, higher 
education officials and admissions officers, 
education researchers, state legislators, the 
governor, representatives of the business com- 
munity, and the media. Presenters noted that 
teachers are almost always deeply involved in 
the process from the beginning, but that other 
groups, particularly school administrators and 
parents, should be involved more directly and 
earlier than usual.

Echoing Colvin’s advice, these presenters
also stressed the importance of making cer- 
tain that standards and assessments include 
items that represent the public’s general un- 
derstanding of a subject and what’s important, 
for example, a computation standard in math- 
ematics or a grammar/spelling standard in 
writing. 

Discussing district-level reform, Los An- 
geles Board of Public Education President 
Mark Slavkin identified several keys in build- 
ing and maintaining credibility at the district 
level. 






“To keep public and political credibility,” 
said Slavkin, “it is necessary to keep one foot 
in the old [norm-referenced assessments] as 
we move to the new standards-based perfor- 
mance assessments.” 

Slavkin identified jargon as a major threat 
to credibility, recommending that everything 
be publicly disclosed in the process of devel- 
opment, and emphasizing the importance of 
keeping the media updated about progress 
along the way. 

But foremost, in Slavkin’s opinion, is “buy-in”
from the very beginning by parents, teach-
ers, and community members. One example 
of a successful, if not always smooth, effort to
get such commitment has been a three-year
project to develop language arts standards, 
curriculum, and performance assessments in 
the Los Angeles Unified School District. De- 
scribed by Charlotte Higuchi, a CRESST part- 
ner and LAUSD teacher, this project involved 
a large and diverse group of teachers, par- 
ents, and community members whose input 
shaped the reform from the beginning. 

Los Angeles Unified School District Su- 
perintendent Sidney Thompson also empha- 
sized the importance of buy-in at the school 
level. “The people who have to do it at the 
school site have to own it; they have to be- 
lieve it can be done, and if they believe that 
and bring the parents into it, then we have a 
pathway to get us there.” 

UTILITY 

Validity, fairness, and credibility are
necessary, but not sufficient, for 
system utility. Because assessment 
systems usually serve multiple purposes and 
users, often with different and competing 
needs, it becomes very difficult to design a 
system useful for all purposes and people. 
Randy Bennett, Educational Testing Service, 
expressed concern that the utility of a com- 
plex, multipurpose assessment system would 
be similar to the Swiss Army Knife, service- 
able in a pinch for all sorts of jobs from re- 
moving screws to opening cans, but not ideal 
for any one of them.

Like a sensitive ecosystem, the utility of
an assessment system will likely be greatest 
when the individual components are 
harmonious with the processes that link them. 
If any process or component is disrupted, the
entire system suffers. Unfortunately, the
political environment in which assessment 
exists is volatile and urgent, as Judith Billings,
then superintendent of education for the state 
of Washington, noted. 

“You [academicians] may publish,” said 
Billings, “whether new assessments work or 
not. But if you don’t give us good information 
about what works, and soon, those of us 
responsible for implementing assessment 
reforms may well perish.” 

In Washington, as in virtually all states, one 
of the purposes of assessment reform has been 
to change teaching and, thus, to improve 
student learning. Many conference presenters 
this year focused on an element central to the 
CRESST assessment model, teacher capacity 
building and teachers’ changing instructional 
practices as a result of changes in assessment 
methods. 



Issues in Improving Instruction
and Teacher Capacity Building

“Teachers are a key to the credibility of
assessment reform and essential to its 
success,” said presenter Sid Thompson. 
But Marilyn Monahan, secretary-treasurer of 
the National Education Association, warned 
against the assumption that teachers will be 
able to immediately embrace reform. Stan- 
dards-based assessment reform demands 
change in instructional practices. Monahan 
stated that teachers welcome this, but that those 
who want reform must invest in teacher knowl- 
edge through professional development. 

“The journey from what takes place at this 
conference to what takes place in teachers’ 
classrooms is long, complex, and unpredict- 
able,” cautioned Monahan. 

Agreeing with Monahan, CRESST partner 
Hilda Borko found in her research with teach- 
ers and students that teachers were interested 
in assessment and instructional reform, but felt 
they didn’t have the time or expertise to de- 
velop alternative assessments on their own. 
Borko found that teachers initially inserted per- 
formance assessment into their instruction; that 
is, the new assessments were added into an 
already busy instructional program. While 
some teachers began to incorporate these as- 
sessments into their ongoing instruction by the 
end of the first year of the reform effort, it often 
was not until the third year of the project, with
help from Borko and others, that teachers were
able to integrate assessment and instructional 
reforms into their daily classroom activities. 

Maryl Gearhart and Megan Franke, 
CRESST/UCLA, reported on action research 
projects focusing on assessment reform in 
mathematics. The integration of assessment 
into instruction, they argued, is essential to 
the realization of mathematics reform envi- 
sioned by the National Council of Teachers of 
Mathematics standards. Gearhart and Franke’s 
research showed that teachers need deeper un- 
derstanding of mathematics and children’s 
mathematical reasoning in order to implement 
new pedagogies at more than a superficial 
level. For example, although many teachers 
participating in their projects began to ask chil- 
dren to share their thinking, only some teach- 
ers were able to probe with specific and
substantive questions or to guide analytical
discussions of student problem-solving strat- 
egies. 

“Classroom instruction and performance 
assessment should be inseparable aspects of 
the education experience for language minority
students,” urged Richard Durán, CRESST/University
of California, Santa Barbara. He recommended
that teachers gather multiple forms
of evidence of student performance — class- 
room tests, graded projects, student self-as- 
sessments, and videotapes revealing students’ 
fluency with learning tools. 

Agreeing with Durán, Thomas Romberg,
University of Wisconsin, Madison, argued that 
“teachers need multiple assessment strategies 
to accurately measure student performance,” 
adding that “in mathematics, teachers need 
to know if students can add, subtract, multi- 
ply and divide, and if they can put these piece- 
meal skills together to solve routine and 
nonroutine problems.” 

“But teachers also need to know if students 
can explain why an answer is correct,” added 
Romberg, “and how students navigate the 
problem-solving process.” He suggested cre- 
ating situations that enable teachers to gather 
information from listening to students’ expla- 
nations, observing students at work, and ex- 
amining the products of their work. 
Large-scale assessment can signal students’ 
level of performance, but teachers need con- 
tinuous, detailed evidence of students’ learn- 
ing processes and understanding to fully 
support the multiple assessment process. 

That large-scale, standards-based perfor- 
mance assessments are not typically designed 
to meet the information needs of the class- 
room teacher was also noted by Phil Daro, 
director for assessment development for the 
New Standards Project. Daro recommended
that one method to help teachers understand
what standards demand of their students is to 
make performance assessments look like well- 
constructed teacher-tests. 

“Teachers do standards-based instruction 
and performance testing all the time,” said 
Daro. “The trick is to clarify the standards and
assessments sufficiently for teachers to
recognize the parallels with their own practice,”
added Daro, “but not so much that the true
complexity of the reform effort is lost.” 

“We need to use large-scale performance 
assessment systems to define standards and 
focus school-level efforts on student work 
linked to standards,” agreed Lynn Winters, 
Long Beach (CA) Unified School District. She 
emphasized the need to engage teachers in 
continuing professional activities to enable 
them to implement effective, ongoing, 
standards-based classroom instruction and 
assessment. 

“Professional development is the hardest 
sell in the reform marketplace,” admitted 
Catherine Smith, Michigan State Department 
of Education, but she added that it was a vital 
component to the success of any reform effort. 

USING TECHNOLOGY TO 
CREATE NEW POSSIBILITIES 

At least a few of the major problems
presented by complex assessment
systems might be resolved by improvements
in technology, according to several conference
presenters. Clearly, technology can
help with managing, linking, and disseminat- 
ing assessment results. But technology also 
may permit test developers to increase authen- 
ticity and open up the range of modalities and 
systems of representation used for assess- 
ment. 

“Future generations of tests will need to 
tap nontraditional constructs, base test 
designs on cognitive principles, and increase 
the diversity of problem types,” noted Randy
Bennett, Educational Testing Service. In spite
of current logistic problems, Bennett predicted 
that large-scale assessments would soon 
include computer-based presentations of 
problem types not possible with paper-and- 
pencil tests. Bennett shared multimedia 
prototype items using historical speeches and 
newscasts to illustrate the potential of 
presenting and asking students to respond to 
“dynamic stimuli.” 

Ron Stevens, CRESST/UCLA, demon- 
strated the use of neural network technology 
to permit real-time assessment of complex 
problem solving. In one of Stevens’ prototypes, 
medical students were presented with realis- 
tically sketchy information about a patient’s 
symptoms, a set of diagnostic tests that they 
could order, and a “library” of reference mate- 
rials. As the students worked through the op- 
tions presented by the computer program, their 
choices were recorded and could be compared 
with patterns of hypotheses generated by ex- 
pert diagnosticians investigating the same 
problem.

Based on lessons learned from his efforts
to develop computer-based assessments of 
group and teamwork processes, Harold O’Neil, 
CRESST/University of Southern California, 
noted a number of problems endemic to 
technology projects. These include the costs 
and time for software design and development, 
inadequacies of existing telecommunications 
technology, unavailability of sophisticated 
technologies in public schools, and the 
complexity and cost of maintaining test 
security. O’Neil also shared the substantial 
progress his group has made in using 
technology to measure the quality and quantity 
of individual contributions and group problem 
solving. He pointed particularly to the potential 
of collaborative concept mapping where 
students work in teams through networked 
technology to create and revise concept maps. 
Collaborative concept mapping makes 
possible the real-time assessment and 
reporting of deep understanding and teamwork 
performance — an example of a potentially 
useful and cost-effective near-term application 
of technology. 

CONCLUSION 

In their closing remarks to the 1996
CRESST conference, Eva Baker and
Robert Linn acknowledged the formidable
challenges ahead. Among them are assuring
system validity; supporting the alignment between
standards and assessments; promoting
fairness; addressing the technical challenges
of measuring progress and linking different
assessments to address the needs and
purposes of many audiences at multiple levels;
improving schools’ and teachers’ capacity;
and productively engaging the public.
“These are all priorities for the research community,”
said Baker and Linn. “They will call
on the best of our technical skills along with
very considerable sociopolitical prowess.”

“I think a key lesson of the past two days,”
concluded Baker, “is that we must approach
the assessment challenges we’ve discussed 
through greater collaboration. It’s clear that we 
share a commitment to improve education. 
Let’s move forward together.” 



1997 CRESST Conference 

September 4 through September 5, 
1997 

Sunset Village Conference Center 
UCLA Campus 

Registration materials and complete details 
will be available in the Summer 1997 CRESST 
Line and on the CRESST World Wide Web 
site, www.cse.ucla.edu. Anyone requiring 
early registration may contact Mary Wilby, 
CRESST/UCLA, 10920 Wilshire Blvd., Ste. 
900, Los Angeles, CA 90024; e-mail:
mary@cse.ucla.edu; phone: (310) 206-1532.

The following CRESST reports are
available by calling Kim Hurst, (310) 206- 
1532, or sending a message to Kim at: 
kim@cse.ucla.edu. 

Reforming Schools by Reforming As- 
sessment: Consequences of the Arizona 
Student Assessment Program (ASAP): 
Equity and Teacher Capacity Building 

Mary Lee Smith 

CSE Technical Report 425, 1997 ($9.00) 

In this study, Mary Lee Smith and other 
researchers focused on how schools changed 
as a result of state-mandated standards and 
assessments. 

The Politics of State Testing: Imple- 
menting New Student Assessments 

Lorraine McDonnell 
CSE Technical Report 424, 1997 ($5.00) 
Lorraine McDonnell continues her synthe- 
ses of innovative state assessment programs 
in Kentucky (Kentucky Instructional Results In- 
formation System), California (California 
Learning Assessment System), and North 
Carolina. 

Teachers’ Developing Ideas and Prac- 
tice About Mathematics Performance 
Assessment: Successes, Stumbling 
Blocks, and Implications for Profes- 
sional Development 



Hilda Borko, Vicky Mayfield, Scott Marion,
Roberta Flexer, and Kate Cumbo

CSE Technical Report 423, 1997 ($3.00) 

This study focuses on the change process 
experienced by a group of third-grade teach- 
ers as they implemented mathematics perfor- 
mance assessments in their classrooms. 
Based on workshop conversations and inter- 
views between teachers and the research/staff 
development team throughout a single school 
year, the team reached five major conclusions. 

New Writing Assessments: The Chal- 
lenge of Changing Teachers’ Beliefs 
About Students as Writers 

Shelby Wolf and Maryl Gearhart
CSE Technical Report 422, 1997 ($3.00)
During a two-year collaboration with elementary
school teachers, Wolf and Gearhart
examined ways that teachers’ beliefs about 
their students as writers mediated their invest- 
ment in new methods of assessing students’ 
writing. 

Teachers’ Beliefs About Assessment 
and Instruction in Literacy 

Carribeth Bliem and Kathryn Davinroy 
CSE Technical Report 421, 1997 ($3.00)
Bliem and Davinroy further investigate 
teachers’ beliefs about assessment and its 
connection to instruction in literacy. Most of 
the data were drawn from transcripts of bi- 
weekly meetings between the research team 
and third-grade teachers using performance 
assessments in their classrooms.

The Politics of Assessment: A View
From the Political Culture of Arizona 

Mary Lee Smith 

CSE Technical Report 420, 1996 ($3.00)
Mary Lee Smith traces the events of the 
Arizona Student Assessment Program (ASAP), 
an innovative multiple assessment program 
that grew out of discontent with mandated 
standardized testing in Arizona. 

Implications of the OECD Comparative 
Study of Performance Standards for 
Educational Reform in the United 
States 

Eva L. Baker 

CSE Technical Report 419, 1996 ($3.00) 

In this report, Eva Baker explores the im- 
plications for education reform in the United 
States of an OECD study of performance stan- 
dards. Using a general model of educational
reform, Baker analyzes the meaning of perfor-
mance standards in the United States, ad- 
dressing key influences of tradition, diversity, 
control, and participation. 

Assessment and Instruction in the Sci- 
ence Classroom 

Gail Baxter, Anastasia Elder, and Robert Glaser 
CSE Technical Report 418, 1996 ($2.50)
Findings from this study of fifth-grade stu- 
dents provided further evidence that critical 
differences exist between students who think 
and reason well with their knowledge and 
those who do not. 



CSE Technical Reports may also be found on 
the CRESST Web site: www.cse.ucla.edu.